Abstract



For any $q > 1$, let $\mathrm{MOD}_q$ be a quantum gate that determines if the number of 1's in the input
is divisible by $q$. We show that for any $q, t > 1$, $\mathrm{MOD}_q$ is equivalent to $\mathrm{MOD}_t$ (up to constant
depth). Based on the case $q = 2$, Moore [8] has shown that quantum analogs of $\mathrm{AC}^{(0)}$, $\mathrm{ACC}[q]$, and
ACC, denoted $\mathrm{QAC}^{(0)}_{wf}$, $\mathrm{QACC}[2]$, and QACC respectively, define the same class of operators, leaving
$q > 2$ as an open question. Our result resolves this question, proving that $\mathrm{QAC}^{(0)}_{wf} = \mathrm{QACC}[q] = \mathrm{QACC}$
for all $q$. We also develop techniques for proving upper bounds for QACC in terms of
related language classes. We define classes of languages EQACC, NQACC and $\mathrm{BQACC}_{\mathbb{Q}}$. We
define a notion of log-planar QACC operators and show the appropriately restricted versions of
EQACC and NQACC are contained in P/poly. We also define a notion of log-gate restricted QACC
operators and show the appropriately restricted versions of EQACC and NQACC are contained in
$\mathrm{TC}^{(0)}$. To do this last proof, we show that $\mathrm{TC}^{(0)}$ can perform iterated addition and multiplication
in certain field extensions. We also introduce the notion of a polynomial-size tensor graph and we
show that families of such graphs can encode the amplitudes resulting from applying an arbitrary
QACC operator to an initial state.



1 Introduction 



Advances in quantum computation in the last decade have been among the most notable in
theoretical computer science. This is due to the surprising improvements in the efficiency of solving
several fundamental combinatorial problems using quantum mechanical methods in place of their 
classical counterparts. These advances led to considerable efforts in finding new efficient quantum 
algorithms for classical problems and in developing a complexity theory of quantum computation. 

While most of the original results in quantum computation were developed using quantum 
Turing machines, they can also be formulated in terms of quantum circuits, which yield a more 
natural model of quantum computation. For example, Shor has shown that quantum circuits 
can factor integers more efficiently than any known classical algorithm for factoring. And quantum 
circuits have been shown (see Yao [16]) to provide a universal model for quantum computation. 

In the classical setting, small depth circuits are considered a good model for parallel computing. 
Constant-depth circuits, corresponding to constant parallel time, are of central importance. For
example, constant-depth circuits of AND, OR and NOT gates of polynomial size (called $\mathrm{AC}^{(0)}$
circuits) can add and subtract binary numbers. The class ACC extends $\mathrm{AC}^{(0)}$ by allowing modular
counting gates. The class $\mathrm{TC}^{(0)}$, consisting of constant-depth threshold circuits, can compute
iterated multiplication.

In studying quantum circuits, it is natural to consider the power of small depth circuit families.
Quantum circuit models analogous to the central classical circuit classes have recently been studied
by Moore and Nilsson and by Moore [8]. They investigated the properties of classes of quantum
operators $\mathrm{QAC}^{(0)}_{wf}$, $\mathrm{QACC}[q]$, and QNC defined to be analogous to and to contain their classical
counterparts. This paper is a contribution to this line of research.

For example, a quantum analog of $\mathrm{AC}^{(0)}$, defined by Moore and denoted $\mathrm{QAC}^{(0)}_{wf}$, is the class
of families of operators which can be built out of products of constantly many layers consisting of
polynomial-sized tensor products of one-qubit gates (analogous to NOT's), Toffoli gates (analogous
to AND's and OR's) and fan-out gates (see footnote 1). An analog of $\mathrm{ACC}[q]$ (i.e., ACC circuit families only
allowing Mod $q$ gates) is $\mathrm{QACC}[q]$, defined similarly to $\mathrm{QAC}^{(0)}_{wf}$, but replacing the fan-out gates
with quantum Mod $q$ gates (which we denote as $\mathrm{MOD}_q$). QACC is the same class but we allow
$\mathrm{MOD}_q$ gates for every $q$. Moore [8] proves the surprising result $\mathrm{QAC}^{(0)}_{wf} = \mathrm{QACC}[2] = \mathrm{QACC}$. This
is in sharp contrast to the classical result of Smolensky [13] that says $\mathrm{ACC}^{(0)}[q] \neq \mathrm{ACC}^{(0)}[p]$ for
any pair of distinct primes $q, p$, which implies that for any prime $p$, $\mathrm{AC}^{(0)} \subset \mathrm{ACC}^{(0)}[p] \subset \mathrm{ACC}$.
Moore's result showed that parity gates are as powerful as any other mod gates in QACC, but left
open the complexity of $\mathrm{MOD}_q$ gates for $q > 2$.

In [8], Moore conjectured that $\mathrm{QACC} \neq \mathrm{QACC}[q]$ for odd $q$. In this paper, we provide the
missing ingredients to show that in fact $\mathrm{QACC} = \mathrm{QACC}[q]$ for any $q \geq 2$. Moore's result showed
that parity is as good as any other $\mathrm{MOD}_q$ gate; our result further shows that any $\mathrm{MOD}_q$ gate is
as good as any other. The main technical contribution is the application of the Quantum Fourier
Transform (using complex $q$th roots of unity), and encodings of base $q$ digits using qubits.

We also develop methods for proving upper bounds for language classes related to QACC. Our
methods result in upper bounds for restricted QACC circuits. Roughly speaking, we show that
QACC is no more powerful than P/poly provided that a layer of "wire-crossings" in the QACC
operator can be written as log many compositions of Kronecker products of controlled-not gates.
We call this class $\mathrm{QACC}^{\log}_{pl}$, where the "pl" is for this planarity condition. We show that if one further
restricts attention to the case where the number of multi-line gates (gates whose input is more
than 1 qubit) is log-bounded, then the circuits are no more powerful than $\mathrm{TC}^{(0)}$. We call this class
$\mathrm{QACC}^{\log}_{gates}$. These results hold for arbitrary complex amplitudes in the QACC circuits.

1 The subscript "wf" in the notation denotes "with fan-out." The idea of fan-out in the quantum setting is subtle,
as will be made clearer later in this paper. See Moore [8] for a more in-depth discussion.

To be more precise, it is necessary to show how a class of operators in QACC can define a 
language, as usually considered in complexity theory. In this paper, we define classes of languages 
EQACC, NQACC, and BQACC based on the expectation of observing a certain state after applying 
the QACC operator to the input state. For example, the class NQACC corresponds to the case 
where x is in the language if the expectation of the observed state after applying the QACC operator 
is non-zero. This is analogous to the definition of the class NQP in Fenner et al.

In this paper, we show that $\mathrm{NQACC}^{\log}_{gates}$ is in $\mathrm{TC}^{(0)}$ and $\mathrm{NQACC}^{\log}_{pl}$ is in P/poly. Although the
proof uses some of the techniques developed by Yamakami and Yao [14] to show that $\mathrm{NQP}_{\mathbb{C}} =
\text{co-}\mathrm{C}_{=}\mathrm{P}$, the small depth circuit case presents technical challenges not present in their setting. In
particular, given a QACC operator built out of layers $M_1, \ldots, M_t$ and an input state $|x, 0^{p(n)}\rangle$, we
must show that a $\mathrm{TC}^{(0)}$ circuit can keep track of the amplitudes of each possible resulting state
as each layer is applied. After all layers have been applied, the $\mathrm{TC}^{(0)}$ circuit then needs to be
able to check that the amplitude of one possible state is non-zero. Unfortunately, there could be
exponentially many states with non-zero amplitudes after applying a layer. To handle this problem
we introduce the idea of a "tensor-graph," a new way to represent a collection of states. We
can extract from these graphs (via $\mathrm{TC}^{(0)}$ or P/poly computations) whether the amplitude of any
particular vector is non-zero.

The exponential growth in the number of states is one of the primary obstacles to proving that
all of NQACC is in $\mathrm{TC}^{(0)}$ (or even P/poly), and thus the tensor graph formalism represents a
significant step towards such an upper bound. The reason the bounds apply only in the restricted
cases is that although tensor graphs can represent any QACC operator, in the case of operators with
layers that might do arbitrary permutations, the top-down approach we use to compute a desired
amplitude from the graph no longer seems to work. We feel that it is likely that the amplitude of
any vector in a tensor graph can be written as a polynomial product of a polynomial sum in some
extension algebra of the ones we work with in this paper, in which case it is quite likely it can be
evaluated in $\mathrm{TC}^{(0)}$.

Another important obstacle to obtaining a $\mathrm{TC}^{(0)}$ upper bound is that one needs to be able
to add and multiply a polynomial number of complex amplitudes that may appear in a QACC
computation. We solve this problem by reducing it to adding and multiplying polynomially many
elements of a certain transcendental extension of the rational numbers. We show that in fact $\mathrm{TC}^{(0)}$
is closed under iterated addition and multiplication of such numbers (Lemma 4.1 below). This
result is of independent interest, and our application of tensor-graphs and these closure properties
of $\mathrm{TC}^{(0)}$ may prove useful in further investigations of small-depth quantum circuits.

We now discuss the organization of the rest of this paper. In the next section we introduce the
definitions and notations we use in this paper. Then in the following section we prove $\mathrm{QACC}[q] =
\mathrm{QACC}$. Finally, in the last section, we prove the $\mathrm{TC}^{(0)}$ and P/poly upper bounds for the restricted
classes discussed above.

2 Preliminaries 

In this section we define the gates used as building blocks for our quantum circuits. Classes
of operators built out of these gates are then defined. We define language classes that can be
determined by these operators and give a couple of definitions from algebra. Lastly, some closure
properties of $\mathrm{TC}^{(0)}$ are described.
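Definition 2.1

By a one-qubit gate we mean an operator from the group $U(2)$.

Let $U = \begin{pmatrix} u_{00} & u_{01} \\ u_{10} & u_{11} \end{pmatrix} \in U(2)$. $\wedge_m(U)$ is defined as: $\wedge_0(U) = U$ and for $m > 0$, $\wedge_m(U)$ is

$$\wedge_m(U)(|\vec{x}, y\rangle) = \begin{cases} u_{y0}|\vec{x}, 0\rangle + u_{y1}|\vec{x}, 1\rangle & \text{if } \wedge_{k=1}^{m} x_k = 1 \\ |\vec{x}, y\rangle & \text{otherwise.} \end{cases}$$

Let $X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$. A Toffoli gate is a $\wedge_m(X)$ gate for some $m > 0$. A controlled-not gate is a $\wedge_1(X)$
gate.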

An ($m$-)spaced controlled-not gate is an operator that maps $|y_1, \ldots, y_m, x\rangle$ to $|x \oplus y_1, y_2, \ldots, y_m, x\rangle$
or $|x, y_1, \ldots, y_m\rangle$ to $|x, y_1, \ldots, y_{m-1}, y_m \oplus x\rangle$.

An ($m$-ary) fan-out gate $F$ is an operator that maps from $|y_1, \ldots, y_m, x\rangle$ to $|x \oplus y_1, \ldots, x \oplus y_m, x\rangle$.
A $\mathrm{MOD}_{q,r}$ gate is an operator that maps $|y_1, \ldots, y_m, x\rangle$ to $|y_1, \ldots, y_m, x \oplus (\sum_i y_i \bmod q = r)\rangle$.
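On computational basis states, the fan-out and MOD gates of Definition 2.1 act as classical reversible maps on bit strings. The following is a minimal Python sketch of that action only; the function names `fan_out` and `mod_gate` are illustrative and do not come from the paper, and superpositions are not modelled.

```python
def fan_out(bits):
    """Fan-out F on a basis state: |y_1..y_m, x> -> |x^y_1, ..., x^y_m, x>."""
    *ys, x = bits
    return tuple(x ^ y for y in ys) + (x,)


def mod_gate(bits, q, r=0):
    """MOD_{q,r} on a basis state: flip the last bit iff sum(y_i) mod q == r."""
    *ys, x = bits
    flag = 1 if sum(ys) % q == r else 0
    return tuple(ys) + (x ^ flag,)
```

Both maps are their own inverses on basis states, which is one way to see that they are permutations and hence unitary.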

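We use the following graphical notation for parity (i.e., $\mathrm{MOD}_2$) or, in the case of $n = 1$, for
controlled-not:

[Figure: parity gate taking inputs $x_1, \ldots, x_n, b$ to outputs $x_1, \ldots, x_n, b \oplus x_1 \oplus \cdots \oplus x_n$]

and for $\mathrm{MOD}_q$:

[Figure: $\mathrm{MOD}_q$ gate taking inputs $x_1, \ldots, x_n, b$ to outputs $x_1, \ldots, x_n, b \oplus \mathrm{Mod}_q(x_1, \ldots, x_n)$]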

As discussed in Moore [8], the no-cloning theorem of quantum mechanics makes it difficult to directly
fan out qubits in constant depth (although constant fan-out is no problem). Thus it is necessary
to define the operator $F$ as in the above definition; refer to [8] for further details. Also, in the
literature it is frequently the case that one says a given operator $M$ on $|y_1, \ldots, y_m\rangle$ can be written
as a tensor product of certain gates. What is meant is that there is a permutation operator $\Pi$ (a
map $|y_1, \ldots, y_m\rangle$ to $|y_{\pi(1)}, \ldots, y_{\pi(m)}\rangle$ for some permutation $\pi$) such that

$$M|y_1, \ldots, y_m\rangle = \Pi \left( \otimes_{j=1}^{n} M_j \right) \Pi^{-1} |y_1, \ldots, y_m\rangle$$

where the $M_j$'s are our base gates, i.e., those gates for which no inherent ordering on the $y_i$
is assumed a priori. Since it is important to keep track of such details in our upper bounds
proofs, we will always use Kronecker products of the form $\otimes_{j=1}^{n} M_j$ without unspoken permutations.
Nevertheless, being able to do permutation operators (not conjugation by a permutation) intuitively
allows our circuits to simulate classical wire crossings. To handle permutations, we allow our
circuits to have controlled-not layers. A controlled-not layer is a gate which performs, in one step,
controlled-not's between an arbitrary collection of disjoint pairs of lines in its domain. That is,
it performs $\Pi \left( \otimes_{j=1}^{n} \wedge_1(X) \right) \Pi^{-1}$ for some permutation operator $\Pi$. Moore and Nilsson show that any
permutation can be written as a finite product of controlled-not layers. We say a controlled-not
layer is log-depth if it can be written as the composition of log many matrices each of which is the
Kronecker product of identities and spaced controlled-not gates.
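A familiar special case of building permutations from controlled-nots is the exchange of two lines with three controlled-not gates. The sketch below checks this standard identity on classical basis states; it is an illustration only and is not claimed to be the construction used by Moore and Nilsson. The helper names `cnot` and `swap_via_cnots` are hypothetical.

```python
def cnot(bits, control, target):
    """Controlled-not on a basis state: XOR the control line into the target line."""
    out = list(bits)
    out[target] ^= out[control]
    return tuple(out)


def swap_via_cnots(bits, i, j):
    """Exchange lines i and j using three controlled-nots."""
    bits = cnot(bits, i, j)
    bits = cnot(bits, j, i)
    bits = cnot(bits, i, j)
    return bits
```

Because the three gates agree with a swap on every basis state, linearity gives the same action on arbitrary superpositions.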

$M^{\otimes n}$ is the $n$-fold Kronecker product of $M$ with itself. The next definitions are based on
Moore [8].

Definition 2.2

$\mathrm{QAC}^{(k)}$ is the class of families $\{F_n\}$, where $F_n$ is in $U(2^{n+p(n)})$, $p$ a polynomial, and each $F_n$ is
writable as a product of $O(\log^k n)$ layers, where a layer is a Kronecker product of one-qubit gates
and Toffoli gates or is a controlled-not layer. Also for all $n$ the number of distinct types of one
qubit gates used must be fixed.

$\mathrm{QACC}^{(k)}[q]$ is the same as $\mathrm{QAC}^{(k)}$ except we also allow $\mathrm{MOD}_q$ gates. $\mathrm{QACC}^{(k)} = \cup_q \mathrm{QACC}^{(k)}[q]$.
$\mathrm{QAC}^{(k)}_{wf}$ is the same as $\mathrm{QAC}^{(k)}$ but we also allow fan-out gates.

QACC is defined as $\mathrm{QACC}^{(0)}$ and $\mathrm{QACC}[q]$ is defined as $\mathrm{QACC}^{(0)}[q]$. $\mathrm{QACC}^{\log}_{pl}$ is QACC restricted
to log-depth controlled-not layers. $\mathrm{QACC}^{\log}_{gates}$ is QACC restricted so that the total number of multi-line
gates in all layers is log-bounded.

If $C$ is one of the above classes, then $C_K$ are the families in $C$ with coefficients restricted to $K$.

Let $\{F_n\}$ and $\{G_n\}$, $G_n, F_n \in U(2^n)$, be families of operators. We say $\{F_n\}$ is $\mathrm{QAC}^{(0)}$ reducible
to $\{G_n\}$ if there is a family $\{R_n\}$, $R_n \in U(2^{n+p(n)})$, of $\mathrm{QAC}^{(0)}$ operators augmented with operators
from $\{G_n\}$ such that for all $n$, $x, y \in \{0,1\}^n$, there is a setting of $z_1, \ldots, z_{p(n)} \in \{0,1\}$ for which
$\langle y|F_n|x\rangle = \langle y, z|R_n|x, z\rangle$. Operator families are $\mathrm{QAC}^{(0)}$ equivalent if they are $\mathrm{QAC}^{(0)}$ reducible to
each other. If $C_1$ and $C_2$ are families of $\mathrm{QAC}^{(0)}$ equivalent operators, we write $C_1 = C_2$.

We refer to the $z_i$'s above as "auxiliary bits" (called "ancillae" in the literature). Note that in proving
$\mathrm{QAC}^{(0)}$ equivalence, the auxiliary bits must be returned to their original values in a computation.

It follows for any $\{F_n\} \in \mathrm{QAC}^{(0)}$ that $F_n$ is writable as a product of a finite number of layers.
Moore [8] shows $\mathrm{QAC}^{(0)}_{wf} = \mathrm{QACC}[2] = \mathrm{QACC}$. Moore [8] places no restriction on the number
of distinct types of one-qubit gates used in a given family of operators. We impose this restriction so that the
number of distinct amplitudes which appear in matrices in a layer is fixed with respect to $n$. This
restriction arises implicitly in the quantum Turing machine case of the upper bounds proofs in
Fenner et al. and Yamakami and Yao [14]. Also, it seems fairly natural since in the classical
case one builds circuits using a fixed number of distinct gate types. Our classes are, thus, more
"uniform" than Moore's. We now define language classes based on our classes of operator families.

Definition 2.3 Let $C$ be a class of families of $U(2^{n+p(n)})$ operators where $p$ is a polynomial and
$n = |x|$.

1. E·C is the class of languages $L$ such that for some $\{F_n\} \in C$ and $\{\langle z_n|\} = \{\langle z_{n,1}, \ldots, z_{n,n+p(n)}|\}$
a family of states, $m := |\langle z_n|F_n|x, 0^{p(n)}\rangle|^2$ is 1 or 0 and $x \in L$ iff $m = 1$.

2. N·C is the class of languages $L$ such that for some $\{F_n\} \in C$ and $\{\langle z_n|\}$ a family of states,
$x \in L$ iff $|\langle z_n|F_n|x, 0^{p(n)}\rangle|^2 > 0$.

3. B·C is the class of languages $L$ so that for some $\{F_n\} \in C$ and $\{\langle z_n|\}$, $x \in L$ if $|\langle z_n|F_n|x, 0^{p(n)}\rangle|^2 >
3/4$ and $x \notin L$ if $|\langle z_n|F_n|x, 0^{p(n)}\rangle|^2 < 1/4$.

It follows that E·C $\subseteq$ N·C and E·C $\subseteq$ B·C. We frequently will omit the '·' when writing a class,
so E·QACC is written as EQACC. Let $|\Psi\rangle := F_n|x, 0^{p(n)}\rangle$. Notice that $|\langle z_n|F_n|x, 0^{p(n)}\rangle|^2 =
\langle\Psi|P_{|z_n\rangle}|\Psi\rangle$, where $P_{|z_n\rangle}$ is the projection matrix onto $|z_n\rangle$. We could allow in our definitions
measurements of up to polynomially many such projection observables and not affect our results
below. However, this would shift the burden of the computation in some sense away from the
QACC operator and instead onto preparation of the observable.

Next are some variations on familiar definitions from algebra. 

Definition 2.4 Let $k > 0$. A subset $\{\beta_i\}_{1 \leq i \leq k}$ of $\mathbb{C}$ is linearly independent if $\sum_{i=1}^{k} a_i \beta_i \neq 0$ for any
$(a_1, \ldots, a_k) \in \mathbb{Q}^k - \{0^k\}$. A set $\{\beta_i\}_{1 \leq i \leq k}$ is algebraically independent if the only $p \in \mathbb{Q}[x_1, \ldots, x_k]$
with $p(\beta_1, \ldots, \beta_k) = 0$ is the zero polynomial.

We now briefly mention some closure properties of $\mathrm{TC}^{(0)}$ computable functions that are useful
in proving $\mathrm{NQACC}^{\log}_{gates} \subseteq \mathrm{TC}^{(0)}$. For proofs of the statements in the next lemma see [11], [12], [3].



Lemma 2.5 (1) $\mathrm{TC}^{(0)}$ functions are closed under composition. (2) The following are $\mathrm{TC}^{(0)}$
computable: $x + y$, $x \dot{-} y := x - y$ if $x - y > 0$ and 0 otherwise, $|x| := \lceil \log_2(x + 1) \rceil$, $x \cdot y$,
$\lfloor x/y \rfloor$, $2^{\min(i, p(|x|))}$, and $\mathrm{cond}(x, y, z) := y$ if $x > 0$ and $z$ otherwise. (3) If $f(i, x)$ is
$\mathrm{TC}^{(0)}$ computable then $\sum_{k=0}^{p(|x|)} f(k, x)$, $\prod_{k=0}^{p(|x|)} f(k, x)$, $\forall i \leq p(|x|)(f(i, x) = 0)$, $\exists i \leq p(|x|)(f(i, x) = 0)$,
and $\mu i \leq p(|x|)(f(i, x) = 0) :=$ the least $i$ such that $f(i, x) = 0$ or $p(|x|) + 1$ otherwise, are $\mathrm{TC}^{(0)}$
computable.

We drop the min from the $2^{\min(i, p(|x|))}$ when it is obvious a suitably large $p(|x|)$ can be found. We
define $\max(x, y) := \mathrm{cond}(1 \dot{-} (y \dot{-} x), x, y)$ and define

$$\max_{i \leq p(|x|)}(f(i)) := (\mu i \leq p(|x|))(\forall j \leq p(|x|)(f(j) \dot{-} f(i) = 0)).$$
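To make the meaning of these base functions concrete, here is a small Python sketch of their input/output behaviour. It is a sequential stand-in written under the reading of the definitions given above, not a threshold-circuit construction, and the names `monus`, `length`, `cond`, `maximum`, `mu` and `argmax_index` are illustrative choices rather than the paper's notation.

```python
def monus(x, y):
    """Truncated subtraction: x - y if positive, else 0."""
    return x - y if x > y else 0


def length(x):
    """|x| = ceil(log2(x + 1)), i.e. the number of bits of x."""
    return x.bit_length()


def cond(x, y, z):
    """y if x > 0, otherwise z."""
    return y if x > 0 else z


def maximum(x, y):
    """max(x, y) expressed through cond and monus."""
    return cond(monus(1, monus(y, x)), x, y)


def mu(pred, bound):
    """Least i <= bound with pred(i) true, or bound + 1 if there is none."""
    for i in range(bound + 1):
        if pred(i):
            return i
    return bound + 1


def argmax_index(f, bound):
    """Least index i <= bound such that f(j) monus f(i) = 0 for all j <= bound."""
    return mu(lambda i: all(monus(f(j), f(i)) == 0 for j in range(bound + 1)), bound)
```

Note that the bounded maximum, as defined through the $\mu$ operator, returns an index at which the maximum is attained rather than the maximal value itself.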

Using the above functions we describe a way to do sequence coding in $\mathrm{TC}^{(0)}$. Let
$\beta_{|t|}(x, w) := \lfloor (w \dot{-} \lfloor w/2^{(x+1)|t|} \rfloor \cdot 2^{(x+1)|t|}) / 2^{x|t|} \rfloor$. The function $\beta_{|t|}$ is useful for block coding. Roughly, $\beta_{|t|}$ first
gets rid of the bits after the $(x+1)|t|$th bit then chops off the low order $x|t|$ bits. Let $B = 2^{|\max(x,y)|}$
so that $B$ is longer than either $x$ or $y$. Hence, we code pairs as $\langle x, y \rangle := (B + y) \cdot 2B + B + x$,
and projections as $(w)_1 := \beta_{\lfloor |w|/2 \rfloor \dot{-} 1}(0, \beta_{\lfloor |w|/2 \rfloor}(0, w))$ and $(w)_2 := \beta_{\lfloor |w|/2 \rfloor \dot{-} 1}(0, \beta_{\lfloor |w|/2 \rfloor}(1, w))$. We
can encode a poly-length, $\mathrm{TC}^{(0)}$ computable sequence of numbers $\langle f(1), \ldots, f(k) \rangle$ as the pair
$\langle \sum_i (f(i) 2^{i \cdot m}), m \rangle$ where $m := |f(\max_i(f(i)))| + 1$. We then define the function which projects out
the $i$th member of a sequence as $\beta(i, w) := \beta_{(w)_2}(i, (w)_1)$.
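The pairing and sequence coding can be exercised directly with integer arithmetic. The sketch below follows the formulas as reconstructed above, treating the subscript of $\beta$ as the block width in bits; the function names are illustrative, and the block numbering from the low-order end is an assumption of this sketch.

```python
def beta(width, x, w):
    """Block x of w (0-based from the low-order end), each block `width` bits wide."""
    hi = 1 << ((x + 1) * width)
    return (w - (w // hi) * hi) >> (x * width)


def pair(x, y):
    """<x, y> := (B + y) * 2B + B + x with B = 2^{|max(x, y)|}."""
    B = 1 << max(x, y).bit_length()
    return (B + y) * 2 * B + B + x


def proj1(w):
    half = w.bit_length() // 2
    return beta(half - 1, 0, beta(half, 0, w))


def proj2(w):
    half = w.bit_length() // 2
    return beta(half - 1, 0, beta(half, 1, w))


def encode_seq(values):
    """Encode f(1), ..., f(k) as <sum_i f(i) * 2^(i*m), m>; values[0] plays the role of f(1)."""
    m = max(values).bit_length() + 1
    total = sum(v << (i * m) for i, v in enumerate(values, start=1))
    return pair(total, m)


def seq_item(i, w):
    """Project out the i-th member (1-based) of an encoded sequence."""
    return beta(proj2(w), i, proj1(w))
```

The leading copy of $B$ in each half of a pair acts as a marker bit, which is why each projection first extracts a half and then discards its top bit.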

We can code integers using the natural numbers by letting the negative integers be the
odd natural numbers and the positive integers be the even natural numbers. $\mathrm{TC}^{(0)}$ can use the
$\mathrm{TC}^{(0)}$ circuits for natural numbers to compute both the polynomial sum and polynomial product
of a sequence of $\mathrm{TC}^{(0)}$ definable integers. It can also compute the rounded quotient of two such
integers. For instance, to do a polynomial sum of integers, compute the natural number which is
the sum of the positive numbers in the sum using cond and our natural number iterated addition
circuit. Then compute the natural number which is the sum of the negative numbers in the sum.
Use the subtraction circuit to subtract the smaller from the larger number and multiply by two.
One is then added if the number should be negative. For products, we compute the product of
the natural numbers which results by dividing each integer code by two and rounding down. We
multiply the result by two. We then sum the number of terms in our product which were negative
integers. If this number is odd we add one to the product we just calculated. Finally, division can
be computed using the Taylor expansion of $1/x$.
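The sign coding and the sum and product recipes can be sketched as follows. This is a sequential illustration of the arithmetic only, with illustrative function names; normalising a zero product to the code 0 (rather than a "negative zero" code) is an assumption of this sketch.

```python
def encode_int(z):
    """Even codes are non-negative integers, odd codes are negative ones."""
    return 2 * abs(z) + (1 if z < 0 else 0)


def decode_int(c):
    return -(c // 2) if c % 2 else c // 2


def coded_sum(codes):
    """Sum of coded integers: add magnitudes by sign, subtract, then re-attach the sign."""
    pos = sum(c // 2 for c in codes if c % 2 == 0)
    neg = sum(c // 2 for c in codes if c % 2 == 1)
    magnitude = pos - neg if pos >= neg else neg - pos
    return 2 * magnitude + (1 if neg > pos else 0)


def coded_product(codes):
    """Product of coded integers: multiply magnitudes, then count negative factors."""
    magnitude = 1
    for c in codes:
        magnitude *= c // 2
    negatives = sum(c % 2 for c in codes)
    # Assumption: a zero product is given the code 0.
    return 2 * magnitude + (1 if negatives % 2 == 1 and magnitude != 0 else 0)
```

For example, the integers 3, -7 and 2 have codes 6, 15 and 4, and their coded sum decodes to -2.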

3 QACC[q]

In this section, we show $\mathrm{QACC}[q] = \mathrm{QACC}$ for any $q \geq 2$.

Let $q \in \mathbb{N}$, $q \geq 2$ be fixed throughout this discussion. Consider quantum states labelled by
digits in $D = \{0, \ldots, q - 1\}$. By analogy with "qubit," we refer to a state of the form

$$\sum_{k=0}^{q-1} c_k |k\rangle$$

with $\sum_k |c_k|^2 = 1$ as a "qudigit." Direct products of the basis states will be labelled by lists of
eigenvalues, e.g., $|x\rangle|y\rangle$ is denoted as $|x, y\rangle$.

We define three important operations on qudigits. The $n$-ary modular addition operator $M_q$
acts as follows:

$$M_q|x_1, \ldots, x_n, b\rangle = |x_1, \ldots, x_n, (b + x_1 + \cdots + x_n) \bmod q\rangle.$$

The gate is represented graphically as in the following figure:

[Figure: $M_q$ gate taking inputs $x_1, \ldots, x_n, b$ to outputs $x_1, \ldots, x_n, (b + x_1 + \cdots + x_n) \bmod q$]

Since $M_q$ merely permutes the states, it is clear that it is unitary. Similarly, the $n$-ary unitary
base $q$ fanout operator $F_q$ acts as

$$F_q|x_1, \ldots, x_n, b\rangle = |(x_1 + b) \bmod q, \ldots, (x_n + b) \bmod q, b\rangle.$$

We write $F$ for $F_2$, since it is the "standard" fan-out gate introduced by Moore (see Definition 2.1).
Note that $M_q^{-1} = M_q^{q-1}$ and $F_q^{-1} = F_q^{q-1}$.
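Since $M_q$ and $F_q$ permute basis states, their inverse relations can be checked by brute force on digit tuples. The following sketch does this for small cases; the names `M`, `F` and `power` are illustrative and the last tuple entry plays the role of $b$.

```python
def M(q, digits):
    """M_q on a basis state: add all x_i into the last digit, mod q."""
    *xs, b = digits
    return tuple(xs) + ((b + sum(xs)) % q,)


def F(q, digits):
    """F_q on a basis state: add the last digit b into every x_i, mod q."""
    *xs, b = digits
    return tuple((x + b) % q for x in xs) + (b,)


def power(gate, q, digits, times):
    """Apply `gate` repeatedly to a basis state."""
    for _ in range(times):
        digits = gate(q, digits)
    return digits
```

Applying either gate $q$ times returns every basis state to itself, so $q - 1$ applications undo a single one.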

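Finally, the Quantum Fourier Transform $H_q$ (which generalizes the Hadamard transform $H$ on
qubits) acts on a single qudigit as

$$H_q|a\rangle = \frac{1}{\sqrt{q}} \sum_{b=0}^{q-1} \zeta^{ab} |b\rangle,$$

where $\zeta = e^{2\pi i/q}$ is a primitive complex $q$th root of unity. It is easy to see that $H_q$ is unitary, via the
fact that $\sum_{\ell=0}^{q-1} \zeta^{a\ell} = 0$ iff $a \not\equiv 0 \bmod q$.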
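The unitarity of $H_q$ and the root-of-unity sum it rests on are easy to check numerically. A minimal sketch using only the standard library, assuming $\zeta = e^{2\pi i/q}$ and with illustrative function names:

```python
import cmath


def qft_matrix(q):
    """Matrix of H_q: entry (a, b) is zeta^(a*b) / sqrt(q)."""
    zeta = cmath.exp(2j * cmath.pi / q)
    return [[zeta ** (a * b) / q ** 0.5 for b in range(q)] for a in range(q)]


def is_unitary(matrix, tol=1e-9):
    """Check that the rows of `matrix` are orthonormal."""
    n = len(matrix)
    for i in range(n):
        for j in range(n):
            dot = sum(matrix[i][k] * matrix[j][k].conjugate() for k in range(n))
            if abs(dot - (1 if i == j else 0)) > tol:
                return False
    return True


def root_sum(q, a):
    """Sum over l of zeta^(a*l); it vanishes unless q divides a."""
    zeta = cmath.exp(2j * cmath.pi / q)
    return sum(zeta ** (a * l) for l in range(q))
```

For $q = 2$ the matrix reduces to the usual Hadamard transform.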

The first observation is that, analogous to parity and fanout for Boolean inputs, the operators 
M q and F q are "conjugates" in the following sense. 

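Proposition 3.1 $M_q = (H_q^{\otimes(n+1)})^{-1} F_q^{-1} H_q^{\otimes(n+1)}$.

Proof. We apply the operators $H_q^{\otimes(n+1)}$, $F_q^{-1}$, and $(H_q^{\otimes(n+1)})^{-1}$ in that order to the state
$|x_1, \ldots, x_n, b\rangle$, and check that the result has the same effect as $M_q$.

The operator $H_q^{\otimes(n+1)}$ simply applies $H_q$ to each of the $n + 1$ qudigits of $|x_1, \ldots, x_n, b\rangle$, which
yields

$$\frac{1}{q^{(n+1)/2}} \sum_{y \in D^n} \sum_{a=0}^{q-1} \zeta^{x \cdot y + ab} |y_1, \ldots, y_n, a\rangle,$$

where $y$ is a compact notation for $y_1, \ldots, y_n$, and $x \cdot y$ denotes $\sum_{i=1}^{n} x_i y_i$. Then applying $F_q^{-1}$ to
the above state yields

$$\frac{1}{q^{(n+1)/2}} \sum_{y \in D^n} \sum_{a=0}^{q-1} \zeta^{x \cdot y + ab} |(y_1 - a) \bmod q, \ldots, (y_n - a) \bmod q, a\rangle.$$

By a change of variable (replacing each $y_i$ by $y_i + a$), the above can be re-written as

$$\frac{1}{q^{(n+1)/2}} \sum_{y \in D^n} \sum_{a=0}^{q-1} \zeta^{x \cdot y + a(b + x_1 + \cdots + x_n)} |y_1, \ldots, y_n, a\rangle.$$

Finally, applying $(H_q^{\otimes(n+1)})^{-1}$ to the above undoes the Fourier transform and puts the coefficient
of $a$ in the exponent into the last slot of the state. The result is

$$(H_q^{\otimes(n+1)})^{-1} F_q^{-1} H_q^{\otimes(n+1)} |x_1, \ldots, x_n, b\rangle = |x_1, \ldots, x_n, (b + x_1 + \cdots + x_n) \bmod q\rangle,$$

which is exactly what $M_q$ would yield.

□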
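The identity of Proposition 3.1 can also be confirmed numerically for small $q$ and $n$. The sketch below represents a state as a dictionary from digit tuples to amplitudes and applies the three operators in the order used in the proof; it assumes $\zeta = e^{2\pi i/q}$, and all function names are illustrative.

```python
import cmath
from itertools import product


def apply_digitwise(state, q, sign):
    """Apply H_q (sign=+1) or its inverse (sign=-1) to every qudigit of the state."""
    zeta = cmath.exp(sign * 2j * cmath.pi / q)
    width = len(next(iter(state)))
    for pos in range(width):
        new = {}
        for digits, amp in state.items():
            for y in range(q):
                out = digits[:pos] + (y,) + digits[pos + 1:]
                new[out] = new.get(out, 0) + amp * zeta ** (digits[pos] * y) / q ** 0.5
        state = new
    return state


def apply_F_inverse(state, q):
    """Inverse fan-out: subtract the last digit from every other digit, mod q."""
    new = {}
    for digits, amp in state.items():
        *ys, a = digits
        out = tuple((y - a) % q for y in ys) + (a,)
        new[out] = new.get(out, 0) + amp
    return new


def conjugated_fanout(q, digits):
    """Apply H, then F^{-1}, then H^{-1} to a basis state; drop negligible amplitudes."""
    state = {tuple(digits): 1.0}
    state = apply_digitwise(state, q, +1)
    state = apply_F_inverse(state, q)
    state = apply_digitwise(state, q, -1)
    return {d: a for d, a in state.items() if abs(a) > 1e-9}


def check_proposition(q, n):
    """True if the conjugated inverse fan-out acts as M_q on every basis state."""
    for digits in product(range(q), repeat=n + 1):
        *xs, b = digits
        expected = tuple(xs) + ((b + sum(xs)) % q,)
        out = conjugated_fanout(q, digits)
        if set(out) != {expected} or abs(out[expected] - 1) > 1e-9:
            return False
    return True
```

Calling `check_proposition(3, 2)` runs the comparison over all 27 basis states of three base-3 digits.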

We now describe how the operators $M_q$, $F_q$ and $H_q$ can be modified to operate on registers
consisting of qubits rather than qudigits. Firstly, we encode each digit using $\lceil \log q \rceil$ bits. Thus, for
example, when $q = 3$, the basis states $|0\rangle$, $|1\rangle$ and $|2\rangle$ are represented by the two-qubit registers
$|00\rangle$, $|01\rangle$ and $|10\rangle$, respectively. Note that there remains one state (in the example, $|11\rangle$) which
does not correspond to any of the qudigits. In general, there will be $2^{\lceil \log q \rceil} - q$ such "non-qudigit"
states. $M_q$, $F_q$ and $H_q$ can now be defined to act on qubit registers, as follows. Consider a state $|x\rangle$
where $x$ is a number represented as $m$ bits (i.e., an $m$-qubit register). If $m < \lceil \log q \rceil$, then $H_q$ leaves
$|x\rangle$ unaffected. If $0 \leq x \leq q - 1$ (where here we are identifying $x$ with the number it represents),
then $H_q$ acts exactly as one expects, namely, $H_q|x\rangle = (1/\sqrt{q}) \sum_{y=0}^{q-1} \zeta^{xy} |y\rangle$. If $x \geq q$, again $H_q$ leaves
$|x\rangle$ unchanged. Since the resulting transformation is a direct sum of unit matrices and matrices of
the form of $H_q$ as it was originally set down, the result is a unitary transformation. $M_q$ and $F_q$ can
be defined to operate similarly on $m$-qubit registers for any $m$: Break up the $m$ bits into blocks of
$\lceil \log q \rceil$ bits. If $m$ is not divisible by $\lceil \log q \rceil$, then $M_q$ and $F_q$ do not affect the "remainder" block
that contains fewer than $\lceil \log q \rceil$ bits. Likewise, in a quantum register $|x_1, \ldots, x_n\rangle$ where each of the
$x_i$'s (with the possible exception of $x_n$) are $\lceil \log q \rceil$-bit numbers, $M_q$ and $F_q$ operate on the blocks
of bits $x_1, \ldots, x_n$ exactly as expected, except that there is no effect on the "non-qudigit" blocks (in
which $x_i \geq q$), or on the (possibly) one remainder block for which $|x_n| < \lceil \log q \rceil$. Since $M_q$ and $F_q$
operate exactly as they did originally on blocks representing qudigits, and like unity for non-qudigit
or remainder blocks, it is clear that they remain unitary.

Henceforth, $M_q$, $F_q$, and $H_q$ should be understood to act on qubit registers as described above.
Nevertheless, it will usually be convenient to think of them as acting on qudigit registers consisting
of $\lceil \log q \rceil$ qubits in each.

Lemma 3.2 $F_q$ and $M_q$ are $\mathrm{QAC}^{(0)}$-equivalent.

Proof. By Barenco et al. [1], any fixed dimension unitary matrix can be computed in fixed depth
using one-qubit gates and controlled nots. Hence $H_q$ can be computed in $\mathrm{QAC}^{(0)}$, as can $H_q^{\otimes(n+1)}$.
The result now follows immediately from Proposition 3.1. □

The classical Boolean $\mathrm{Mod}_q$-function on $n$ bits is defined so that $\mathrm{Mod}_q(x_1, \ldots, x_n) = 1$ iff
$\sum_{i=1}^{n} x_i \equiv 0 \pmod q$. (The more common definition sets it to 1 if $\sum_{i=1}^{n} x_i$ is not divisible by
$q$, but this convention is less convenient in this setting, and is not important technically either.)
We also define $\mathrm{Mod}_{q,r}(x_1, \ldots, x_n)$ to output 1 iff $\sum_{i=1}^{n} x_i \equiv r \pmod q$. Note that $\mathrm{Mod}_q = \mathrm{Mod}_{q,0}$.
Reversible, quantum versions of these functions can also be defined. The operator $\mathrm{MOD}_{q,r}$ on $n + 1$
qubits has the following effect:

$$|x_1, \ldots, x_n, b\rangle \mapsto |x_1, \ldots, x_n, b \oplus \mathrm{Mod}_{q,r}(x_1, \ldots, x_n)\rangle.$$

We write $\mathrm{MOD}_{q,0}$ as $\mathrm{MOD}_q$. Since negation is built into the output (via the exclusive OR), it
is easy to simulate negations using $\mathrm{MOD}_{q,r}$ gates. For example, by setting $b = 1$, we can compute
$\neg\mathrm{Mod}_{q,r}$. More generally, using one auxiliary bit, it is possible to simulate "$\neg\mathrm{MOD}_{q,r}$," defined so
that

$$|x_1, \ldots, x_n, b\rangle \mapsto |x_1, \ldots, x_n, b \oplus (\neg\mathrm{Mod}_{q,r}(x_1, \ldots, x_n))\rangle,$$

using just $\mathrm{MOD}_{q,r}$ and a controlled-NOT gate. Thus $\mathrm{MOD}_{q,r}$ and $\neg\mathrm{MOD}_{q,r}$ are $\mathrm{QAC}^{(0)}$-equivalent.
Moore's version of $\mathrm{MOD}_q$ is our $\neg\mathrm{MOD}_q$. Observe that $\mathrm{MOD}_{q,r}^{-1} = \mathrm{MOD}_{q,r}$.

Lemma 3.3 $\mathrm{MOD}_q$ and $M_q$ are $\mathrm{QAC}^{(0)}$-equivalent.

Proof. First note that $\mathrm{MOD}_q$ and $\mathrm{MOD}_{q,r}$ are equivalent, since a $\mathrm{MOD}_{q,r}$ gate can be simulated
by a $\mathrm{MOD}_q$ gate with $q - r$ extra inputs set to the constant 1. Hence we can freely use $\mathrm{MOD}_{q,r}$
gates in place of $\mathrm{MOD}_q$ gates.
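The padding argument is easy to check on classical inputs. In the sketch below, `mod_q` models the output bit of a $\mathrm{MOD}_q$ gate and `mod_q_r` simulates $\mathrm{MOD}_{q,r}$ by appending constant 1 inputs; the names are illustrative, and padding with $(q - r) \bmod q$ ones (so that $r = 0$ needs no padding) is a small simplification made here.

```python
def mod_q(ys, b, q):
    """Output bit of MOD_q: b XOR [sum(ys) is divisible by q]."""
    return b ^ (1 if sum(ys) % q == 0 else 0)


def mod_q_r(ys, b, q, r):
    """Simulate MOD_{q,r} with a MOD_q gate and extra inputs fixed to 1."""
    padded = list(ys) + [1] * ((q - r) % q)
    return mod_q(padded, b, q)
```

The simulation works because adding $q - r$ to the sum makes it divisible by $q$ exactly when the original sum is congruent to $r$.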

It is easy to see that, given an $M_q$ gate, we can simulate a $\mathrm{MOD}_q$ gate. Applying $M_q$ to $n + 1$
digits (represented as bits, but each digit only taking on the values 0 or 1) transforms

$$|x_1, \ldots, x_n, 0\rangle \mapsto |x_1, \ldots, x_n, (\sum_i x_i) \bmod q\rangle.$$

Now send the bits of the last block ($\sum_i x_i \bmod q$) to a Toffoli gate with all inputs negated and
target bit $b$. The resulting output is exactly $b \oplus \mathrm{Mod}_q(x_1, \ldots, x_n)$. The bits in the last block can be
erased by re-negating them and reversing the $M_q$ gate. This leaves only $x_1, \ldots, x_n$, $O(n)$ auxiliary
bits, and the output $b \oplus \mathrm{Mod}_q(x_1, \ldots, x_n)$.

The converse (simulating $M_q$ given $\mathrm{MOD}_q$) requires some more work. The first step is to show
that $\mathrm{MOD}_q$ can also determine if a sum of digits is divisible by $q$. Let $x_1, \ldots, x_n \in D$ be a set of
digits represented as $\lceil \log q \rceil$ bits each. For each $i$, let $x_i^{(k)}$ ($0 \leq k \leq \lceil \log q \rceil - 1$) denote the bits of
$x_i$. Since the numerical value of $x_i$ is $\sum_{k=0}^{\lceil \log q \rceil - 1} x_i^{(k)} 2^k$, it follows that

$$\sum_{i=1}^{n} \sum_{k=0}^{\lceil \log q \rceil - 1} x_i^{(k)} 2^k = \sum_{i=1}^{n} x_i.$$
The idea is to express this last sum in terms of a set of Boolean inputs that are fed into a
$\mathrm{MOD}_q$ gate. To account for the factors $2^k$, each $x_i^{(k)}$ is fanned out $2^k$ times before plugging it
into the $\mathrm{MOD}_q$ gate. Since $k < \lceil \log q \rceil$, this requires only constant depth and $O(n)$ auxiliary bits
(which of course are set back to 0 in the end by reversing the fanout). Thus, just using $\mathrm{MOD}_q$
and constant fanout, we can determine if $\sum_{i=1}^{n} x_i \equiv 0 \pmod q$. More generally, we can determine
if $\sum_{i=1}^{n} x_i \equiv r \pmod q$ using just a $\mathrm{MOD}_{q,r}$ gate and constant fanout. Let $\mathrm{MOD}_{q,r}(x_1, \ldots, x_n)$
denote the resulting circuit, that determines if a sum of digits is congruent to $r$ mod $q$. The
construction of $\mathrm{MOD}_{q,r}(x_1, \ldots, x_n)$ is illustrated in the figure below for the case of $q = 3$. In the
figure, mod($x$) denotes $\mathrm{Mod}_{3,r}(x_1, \ldots, x_n)$. The notation on the right will be used as a shorthand
for this circuit:

[Figure: construction of the $\mathrm{MOD}_{q,r}$ circuit on digits for $q = 3$, with its shorthand notation on the right]
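The fan-out trick can be mimicked classically: repeating bit $k$ of each digit $2^k$ times turns a sum of digits into a sum of bits with the same value. The sketch below is a classical illustration only, with illustrative names; it assumes bit $k$ is the bit of weight $2^k$ and uses $\lceil \log_2 q \rceil$ bits per digit.

```python
def digit_bits(x, width):
    """Bits of x, least significant first, padded to `width` bits."""
    return [(x >> k) & 1 for k in range(width)]


def mod_on_digits(digits, q, r=0):
    """Decide whether sum(digits) is congruent to r mod q using only a sum of bits."""
    width = (q - 1).bit_length()  # ceil(log2 q) bits per digit
    wires = []
    for x in digits:
        for k, bit in enumerate(digit_bits(x, width)):
            wires.extend([bit] * (2 ** k))  # fan bit k out 2^k times
    return 1 if sum(wires) % q == r else 0
```

Each digit contributes fewer than $2q$ wires, so the number of inputs to the underlying gate stays linear in $n$.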
We can get the bits in the value of the sum $\sum_{i=1}^{n} x_i \bmod q$ using $\mathrm{MOD}_{q,r}$ circuits. This is done,
essentially, by implementing the relation $x \bmod q = \sum_{r=0}^{q-1} r \cdot \mathrm{Mod}_{q,r}(x)$. For each $r$, $0 \leq r \leq q - 1$,
we compute $\mathrm{Mod}_{q,r}(x_1, \ldots, x_n)$ (where now the $x_i$'s are digits). This can be done by applying the
$\mathrm{MOD}_{q,r}$ circuits in series (for each $r$) to the same inputs, introducing an auxiliary 0-bit for each
application, as illustrated here.

[Figure: $\mathrm{MOD}_{q,0}$, $\mathrm{MOD}_{q,1}$ and $\mathrm{MOD}_{q,2}$ circuits applied in series to $x_1, \ldots, x_n$, each writing $\mathrm{Mod}_{q,r}(x_1, \ldots, x_n)$ onto its own auxiliary 0-bit]

Let $r_k$ denote the $k$th bit of $r$. For each $r$ and for each $k$, we take the AND of the output
of the $\mathrm{MOD}_{q,r}$ with $r_k$ (again by applying the AND's in series, which is still constant depth, but
introduces $q$ extra auxiliary inputs). Let $a_{k,r}$ denote the output of one of these AND's. For each $k$,
we OR together all the $a_{k,r}$'s, that is, compute $\vee_{r=0}^{q-1} a_{k,r}$, again introducing a constant number of
auxiliary bits. Since only one of the $r$'s will give a non-zero output from $\mathrm{MOD}_{q,r}$, this collection of
OR gates outputs exactly the bits in the value of $\sum_{i=1}^{n} x_i \bmod q$. Call the resulting circuit $C$, and
the sum it outputs $S$.
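The AND/OR step can be sketched classically as follows. The indicator list stands in for the outputs of the $\mathrm{MOD}_{q,r}$ circuits, and the function name `sum_mod_q_bits` is illustrative; bits are returned least significant first, which is an assumption of this sketch.

```python
def sum_mod_q_bits(digits, q):
    """Bits of (sum of digits) mod q, built from Mod_{q,r} indicators with AND and OR."""
    width = (q - 1).bit_length()  # ceil(log2 q) output bits
    indicators = [1 if sum(digits) % q == r else 0 for r in range(q)]
    bits = []
    for k in range(width):
        # a_{k,r} = Mod_{q,r}(x) AND (k-th bit of r)
        a = [indicators[r] & ((r >> k) & 1) for r in range(q)]
        bits.append(1 if any(a) else 0)  # OR over all r
    return bits
```

Exactly one indicator is 1, so the OR for position $k$ simply reads off bit $k$ of the matching residue.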
Finally, to simulate $M_q$, we need to include the input digit $b \in D$. To do this, we apply a unitary
transformation $T$ to $|S, b\rangle$ that transforms it to $|S, (b + S) \bmod q\rangle$. By Barenco et al. [1] (as in the
proof of Lemma 3.2), $T$ can be computed in fixed depth using one-qubit gates and controlled-NOT
gates. Now using $S$ and all the other auxiliary inputs, we reverse the computation of the circuit
$C$, thus clearing the auxiliary inputs. This is illustrated in this figure:

[Figure: circuit $C$ computing $S$, the transformation $T$ producing $(b + S) \bmod q$, and the reversal of $C$]

The result is an output consisting of $x_1, \ldots, x_n$, $O(n)$ auxiliary bits, and $(b + \sum_{i=1}^{n} x_i) \bmod q$,
which is the output of an $M_q$ gate. □

It is clear that we can fan out digits, and therefore bits, using an $F_q$ gate (setting $x_i = 0$ for
$1 \leq i \leq n$ fans out $n$ copies of $b$). It is slightly less obvious (but still straightforward) that, given
an $F_q$ gate, we can fully simulate an $F$ gate.

Lemma 3.4 For any $q \geq 2$, $F$ and $F_q$ are $\mathrm{QAC}^{(0)}$-equivalent.

Proof. By the preceding lemmas, $F_q$ and $\mathrm{MOD}_q$ are $\mathrm{QAC}^{(0)}$-equivalent. By Moore's result,
$\mathrm{MOD}_q$ is $\mathrm{QAC}^{(0)}$-reducible to $F$. Hence $F_q$ is $\mathrm{QAC}^{(0)}$-reducible to $F$.

Conversely, arrange each block of $\lceil \log q \rceil$ input bits to an $F_q$ gate as follows. For the control-bit
block (which contains the bit we want to fan out), set all but the last bit to zero, and call the
last bit $b$. Set all bits in the $i$th input-bit block to 0. Now the $i$th output of the $F_q$ circuit is $b$,
represented as $\lceil \log q \rceil$ bits with only one possibly nonzero bit. Send this last output bit $b$ and the
input bit $x_i$ to a controlled-NOT gate. The outputs of that gate are $b$ and $b \oplus x_i$. Now apply $F_q^{-1}$ to
the bits that were the outputs of the $F_q$ gate (which are all left unchanged by the controlled-not's).
This returns all the $b$'s to 0 except for the control bit which is always unchanged. The outputs
of the controlled-not's give the desired $b \oplus x_i$. Thus the resulting circuit simulates $F$, with $O(n)$
auxiliary bits. □
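The converse direction of the proof can be traced on basis states at the level of digits. In the sketch below each auxiliary block is modelled as a single base-$q$ digit, and the names `F_q`, `F_q_inverse` and `simulate_F` are illustrative; the bit-level layout of the blocks is not modelled.

```python
def F_q(q, blocks, control):
    """Base-q fan-out on a basis state: add the control digit into every block."""
    return [(d + control) % q for d in blocks], control


def F_q_inverse(q, blocks, control):
    """Inverse of F_q: subtract the control digit from every block."""
    return [(d - control) % q for d in blocks], control


def simulate_F(q, xs, b):
    """Simulate the bit fan-out F on inputs xs with control bit b, using F_q."""
    aux = [0] * len(xs)                    # auxiliary blocks start at 0
    aux, b = F_q(q, aux, b)                # every auxiliary block now holds b
    xs = [x ^ a for x, a in zip(xs, aux)]  # controlled-not from each copy onto x_i
    aux, b = F_q_inverse(q, aux, b)        # auxiliary blocks return to 0
    return xs, aux, b
```

The returned auxiliary list is all zeros, matching the requirement that auxiliary bits be restored to their original values.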



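Theorem 3.5 For any $q \in \mathbb{N}$, $q \neq 1$, $\mathrm{QACC} = \mathrm{QACC}[q]$.

Proof. By the preceding lemmas, fanout of bits is equivalent to the $\mathrm{MOD}_q$ function. By Moore's
result, we can do $\mathrm{MOD}_q$ if we can do fanout in constant depth. By our result, we can do fanout,
and hence $\mathrm{MOD}_2$, if we can do $\mathrm{MOD}_q$. Hence $\mathrm{QACC} = \mathrm{QACC}[2] \subseteq \mathrm{QACC}[q]$. Since
$\mathrm{QACC}[q] \subseteq \mathrm{QACC}$ by definition, equality follows. □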



4 Upper Bounds 

In this section, we prove the following upper bounds results: $\mathrm{NQACC}^{\log}_{gates} \subseteq \mathrm{TC}^{(0)}$, $\mathrm{BQACC}^{\log}_{\mathbb{Q},gates} \subseteq
\mathrm{TC}^{(0)}$, $\mathrm{NQACC}^{\log}_{pl} \subseteq$ P/poly, and $\mathrm{BQACC}^{\log}_{\mathbb{Q},pl} \subseteq$ P/poly.

Suppose $\{F_n\}$ and $\{z_n\}$ determine a language $L$ in NQACC. Let $F_n$ be the product of the
layers $U_1, \ldots, U_t$ and $E$ be the distinct entries of the matrices used in the $U_j$'s. By our definition
of QACC, the size of $E$ is fixed with respect to $n$. We need a canonical way to write sums and
products of elements in $E$ to be able to check $|\langle z_n|U_1 \cdots U_t|x, 0^{p(n)}\rangle|^2 > 0$ with a $\mathrm{TC}^{(0)}$ function.
To do this let $A = \{\alpha_i\}_{1 \leq i \leq m}$ be a maximal algebraically independent subset of $E$. Let $F = \mathbb{Q}(A)$
and let $B = \{\beta_i\}_{0 \leq i < d}$ be a basis for the field $G$ generated by the elements in $(E - A) \cup \{1\}$ over
$F$. Since the sizes of the bases of $F$ and $G$ are less than the cardinality of $E$, the size of these bases
is also fixed with respect to $n$.

As any sum or product of elements in $E$ is in $G$, it suffices to come up with a canonical form
for elements in $G$. Our representation is based on Yamakami and Yao [14]. Let $\alpha \in G$. Since $B$
is a basis, $\alpha = \sum_{j=0}^{d-1} \lambda_j \beta_j$ for some $\lambda_j \in F$. We encode an $\alpha$ as a $d$-tuple (we iterate the pairing
function from the preliminaries to make $d$-tuples) $\langle \ulcorner\lambda_0\urcorner, \ldots, \ulcorner\lambda_{d-1}\urcorner \rangle$, where $\ulcorner\lambda_j\urcorner$ encodes $\lambda_j$. As the
elements of $A$ are algebraically independent, each $\lambda_j = s_j/u_j$ where $s_j$ and $u_j$ are of the form

$$\sum_{k_j,\ |k_j| \le e} a_{k_j} \prod_{i=1}^{m} \alpha_i^{k_{ij}}.$$

Here $k_j = (k_{1j}, \ldots, k_{mj}) \in \mathbb{Z}^m$, $|k_j|$ is $\sum_i k_{ij}$, $a_{k_j} \in \mathbb{Z}$, and $e \in \mathbb{N}$. In particular, any product
$\beta_m \cdot \beta_i = \sum_{j=0}^{d-1} \lambda_j \beta_j$ with $\lambda_j = s_j/u_j$ and $s_j$ and $u_j$ in this form. We take a common denominator
$u$ for elements of $E \cup \{\beta_m \cdot \beta_i\}$ and not just $E$, since the $\lambda_j$'s associated with the $\beta_m \cdot \beta_i$ might
have additional factors in their denominators not in $E$. Also fix an $e$ large enough to bound the
$|k_j|$'s which might appear in any element of $E$ or a product $\beta_m \cdot \beta_i$. This $e$ will be constant with
respect to $n$. In multiplying $t$ layers of a QACC circuit against an input, the entries in the result
will be polynomial sums and products of elements in $E \cup \{\beta_m \cdot \beta_i\}$, so we can bound $|k_j|$ for $k_j$'s
which appear in the $\lambda_j$'s of such an entry by $e \cdot p(n)$. To complete our representation of $\alpha \in G$ we
encode $\lambda_j$ as the sequence $\langle r, \langle\langle a_{k_j}, k_{1j}, \ldots, k_{mj} \rangle\rangle \rangle$, where $r$ is the power to which $u$ is raised and
$\langle\langle a_{k_j}, k_{1j}, \ldots, k_{mj} \rangle\rangle$ is the sequence of $\langle a_{k_j}, k_{1j}, \ldots, k_{mj} \rangle$'s that appear in $s_j$. By our discussion, the
encoding of an $\alpha$ that appears as an entry in the output after applying a QACC operator to the
input is of polynomial length and so can be manipulated in $\mathrm{TC}^{(0)}$.
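
As an informal illustration of the kind of object being encoded, the following sketch stores an element of $\mathbb{Z}[A]$ as a list of (coefficient, exponent vector) data, here held in a Python dictionary keyed by the exponent vector $k = (k_1, \ldots, k_m)$. The dictionary layout and the function names `poly_add`, `poly_mul` and `degree` are assumptions made for this example only; they are not the sequence coding used in the paper, and ordinary sequential code says nothing about $\mathrm{TC}^{(0)}$ computability.

```python
from collections import defaultdict

# Hypothetical stand-in for the encoding in the text: an element of Z[A] is a
# dict mapping exponent vectors k = (k_1, ..., k_m) to integer coefficients a_k.

def poly_add(p, q):
    """Add two encoded elements by adding coefficients of equal exponent vectors."""
    out = defaultdict(int)
    for poly in (p, q):
        for k, a in poly.items():
            out[k] += a
    return {k: a for k, a in out.items() if a != 0}

def poly_mul(p, q):
    """Multiply two encoded elements: exponent vectors add, coefficients multiply."""
    out = defaultdict(int)
    for k1, a1 in p.items():
        for k2, a2 in q.items():
            k = tuple(u + v for u, v in zip(k1, k2))
            out[k] += a1 * a2
    return {k: a for k, a in out.items() if a != 0}

def degree(p):
    """Largest |k| = sum_i k_i over the terms of p (0 for the zero element)."""
    return max((sum(k) for k in p), default=0)

# Example with m = 2 generators alpha_1, alpha_2.
x = {(1, 0): 1, (0, 0): 2}    # alpha_1 + 2
y = {(0, 1): 3, (1, 0): -1}   # 3*alpha_2 - alpha_1
```

Multiplying $t$ such elements whose terms satisfy $|k| \le e$ gives terms with $|k| \le e \cdot t$, which is the growth that the bound $e \cdot p(n)$ in the text accounts for.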
We need the following lemma:

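Lemma 4.1 Let $p$ be a polynomial. (1) Let $f(i,x) \in \mathrm{TC}^{(0)}$ output encodings of $a_{i,x} \in \mathbb{Z}[A]$. Then
$\mathbb{Z}[A]$ encodings of $\sum_{i=1}^{p(|x|)} a_{i,x}$ and $\prod_{i=1}^{p(|x|)} a_{i,x}$ are $\mathrm{TC}^{(0)}$ computable. (2) Let $f(i,x) \in \mathrm{TC}^{(0)}$ output
encodings of $a_{i,x} \in G$. Then $G$ encodings of $\sum_{i=1}^{p(|x|)} a_{i,x}$ and $\prod_{i=1}^{p(|x|)} a_{i,x}$ are $\mathrm{TC}^{(0)}$ computable.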



Proof. We will abuse notation in this proof and identify the encoding $f(i,x)$ with its value $a_{i,x}$.
So $\sum_i f(i,x)$ and $\prod_i f(i,x)$ will mean the encoding of $\sum_i a_{i,x}$ and $\prod_i a_{i,x}$ respectively.

(1) To do sums, the first thing we do is form the list $L1 = \langle f(0,x), \ldots, f(p(|x|),x) \rangle$. Then
we create a flattened list $L2$ from this with elements which are the $\langle a_{k_j}, k_{1j}, \ldots, k_{mj} \rangle$'s from the
$f(i,x)$'s. $L1$ is in $\mathrm{TC}^{(0)}$ using our definition of sequence from the preliminaries, and closure under
sums and $\max_i$ to find the length of the longest $f(i,x)$. To flatten $L1$ we use $\max_i$ to find the
length $d$ of the longest $f(i,x)$ for $i \le p(|x|)$. Then using max twice we can find the length of the
longest $\langle a_{k_j}, k_{1j}, \ldots, k_{mj} \rangle$. This will be the second coordinate in the pair used to define the sequence
$L2$. We then do a sum of size $d \cdot p(|x|)$ over the subentries of $L1$ to get the first coordinate of the pair
used to define $L2$. Given $L2$, we make a list $L3$ of the distinct $k_j$'s that appear as $\langle a_{k_j}, k_{1j}, \ldots, k_{mj} \rangle$
in some $f(i,x)$ for some $i \le p(|x|)$. This list can be made from $L2$ using sums, cond and $\mu$. We
sum over the $t \le \mathrm{length}(L2)$ and check if there is some $t' < t$ such that the $t'$th element of $L2$ has
the same $k_j$ as the $t$th; if not, we add the $t$th element's $k_j$ times 2 raised to the appropriate power. We know
what power by computing the sum of the number of smaller $t'$ that passed this test. Using cond
and closure under sums we can compute in $\mathrm{TC}^{(0)}$ a function which takes a list like $L2$ and a $k_j$ and
returns the sum of all the $a_{k_j}$'s in this list. So using this function and the lists $L2$ and $L3$ we can
compute the desired encoding.

For products, since the $\alpha_i$'s of $A$ are algebraically independent, $\mathbb{Z}[A]$ is isomorphic to the
polynomial ring $\mathbb{Z}[y_1, \ldots, y_m]$ under the natural map which takes $\alpha_j$ to $y_j$. We view our encodings
$f(i,x)$ as $m$-variate polynomials in $\mathbb{Z}[y_1, \ldots, y_m]$. We describe for any $p'$ a circuit that works for
any $\mathrm{TC}^{(0)}$ computable $f(i,x)$ such that $\prod_i f(i,x)$ is of degree less than $p'$ viewed as an $m$-variate
polynomial. In $\mathrm{TC}^{(0)}$ we define $g(i,x)$ to consist of the sequence of polynomially many integer values
which result from evaluating the polynomial encoded by $f(i,x)$ at the points $(i_1, \ldots, i_m) \in \mathbb{N}^m$
where $0 \le i_s$ and $\sum_s i_s \le p'$. To compute $f(i,x)$ at a point involves computing a polynomial sum of
a polynomial product of integers, and so will be in $\mathrm{TC}^{(0)}$. Using closure under polynomial integer
products we compute $k(j,x) := \prod_i \beta(j, g(i,x))$, where $\beta$ is the sequence projection function from the
preliminaries. Our choice of points is what is called by Chung and Yao the $p'$-th order principal
lattice of the $m$-simplex given by the origin and the points $p'$ from the origin in each coordinate
axis. By Theorems 1 and 4 of that paper (proved earlier by a harder argument in Nicolaides) the
multivariate Lagrange interpolant of degree $p'$ through the points $k(j,x)$ is unique. This interpolant
is of the form $P(y_1, \ldots, y_m) = \sum_j p_j(y_1, \ldots, y_m)\, k(j,x)$, where the $p_j$'s are polynomials which do
not depend on the function $f$. An explicit formula for these $p_j$'s is given in Corollary 2 of Chung
and Yao as a polynomial product of linear factors. Since these polynomials are all of degree less
than $p'$, they have only polynomially many (in $p'$) coefficients, and in PTIME these coefficients can be
computed by iteratively multiplying the linear factors together. We can then hard code these $p_j$'s
(since they do not depend on $f$) into our circuit, and with these $p_j$'s, $k(j,x)$, and closure under sums
we can compute the polynomial of the desired product in $\mathrm{TC}^{(0)}$.
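
The evaluate, multiply pointwise, then interpolate strategy can be illustrated in the simplest setting $m = 1$, where the principal lattice is just the points $0, 1, \ldots, p'$. The sketch below is a sequential toy under that assumption, using exact rational arithmetic; the function names are hypothetical, and the multivariate Chung and Yao formulas are not reproduced here.

```python
from fractions import Fraction

def evaluate(coeffs, x):
    """Evaluate the polynomial with coefficients [c0, c1, ...] at x (Horner's rule)."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

def lagrange_basis(points):
    """Coefficient lists of the Lagrange basis polynomials p_j for the given points.
    They depend only on the points, so they can be precomputed once."""
    basis = []
    for j, xj in enumerate(points):
        num = [Fraction(1)]   # running product of the linear factors (y - x_i), i != j
        denom = Fraction(1)
        for i, xi in enumerate(points):
            if i == j:
                continue
            new = [Fraction(0)] * (len(num) + 1)
            for d, c in enumerate(num):
                new[d] -= c * xi      # contribution of -x_i * c * y^d
                new[d + 1] += c       # contribution of c * y^(d+1)
            num = new
            denom *= (xj - xi)
        basis.append([c / denom for c in num])
    return basis

def product_by_interpolation(polys, degree_bound):
    """Coefficients of prod_i polys[i], assuming its degree is at most degree_bound."""
    points = list(range(degree_bound + 1))
    basis = lagrange_basis(points)
    # k[j] = product over i of f_i evaluated at the j-th lattice point.
    k = []
    for x in points:
        value = 1
        for f in polys:
            value *= evaluate(f, x)
        k.append(value)
    # P(y) = sum_j p_j(y) * k[j]
    result = [Fraction(0)] * (degree_bound + 1)
    for pj, kj in zip(basis, k):
        for d, c in enumerate(pj):
            result[d] += c * kj
    return [int(c) for c in result]
```

For instance, `product_by_interpolation([[1, 1], [1, 1], [2, 0, 1]], 4)` recovers the coefficients of $(1+y)^2(2+y^2)$. The point of the construction in the text is that only integer products and fixed linear combinations are needed once the basis polynomials are hard coded.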

(2) We do sums first. Assume $f(i,x) := \sum_{j=0}^{d-1} \lambda_{ij} \beta_j$. One immediate problem is that the $\lambda_{ij}$
and $\lambda_{i'j}$ might use different $u^r$'s for their denominators. Since $\mathrm{TC}^{(0)}$ is closed under poly-sized
maximum, it can find the maximum value $r_0$ to which $u$ is raised. Then it can define a function
$g(i,x) = \sum_{j=0}^{d-1} \gamma_{ij} \beta_j$ which encodes the same element of $G$ as $f(i,x)$ but where the denominators
of the $\gamma_{ij}$'s are now $u^{r_0}$. If $\lambda_{ij}$ was $s_j/u^r$ we need to compute the encoding of $s_j \cdot u^{r_0 - r}/u^{r_0}$. This is
straightforward from (1). Now

$$\sum_{i=1}^{p(|x|)} f(i,x) = \sum_{i=1}^{p(|x|)} g(i,x) = \sum_{j=0}^{d-1} \Big[ \Big( \sum_{i=1}^{p(|x|)} s_{ij} \Big) / u^{r_0} \Big] \beta_j,$$

where the $s_{ij}$'s are the numerators of the $\gamma_{ij}$'s in $g(i,x)$. From part (1) we can compute the encoding
$e_j$ of $\sum_{i=1}^{p(|x|)} s_{ij}$ in $\mathrm{TC}^{(0)}$. So the desired answer $\langle \langle r_0, e_0 \rangle, \ldots, \langle r_0, e_{d-1} \rangle \rangle$ is in $\mathrm{TC}^{(0)}$.
For products $\prod_{i=1}^{p(|x|)} f(i,x)$, we play the same trick as in the $\mathbb{Z}[A]$ product case. We view
our encodings of elements of $G$ as $d$-variate polynomials in $F(y_0, \ldots, y_{d-1})$ under the map which takes $\beta_\kappa$
to $y_\kappa$. (Note that this map is not necessarily an isomorphism.) We then create a function $g(i,x)$
which consists of the sequence of values obtained by evaluating $f(i,x)$ at polynomially many points
in a lattice as in the first part of this lemma. Evaluating $f(i,x)$ at a point can easily be done
using the first part of this lemma. We then use part (1) of this lemma to compute the products
$k(j,x) = \prod_i \beta(j, g(i,x))$. We then get the interpolant $P(y_0, \ldots, y_{d-1}) = \sum_j p_j(y_0, \ldots, y_{d-1})\, k(j,x)$. We
non-uniformly obtain the encoding of $p_j(\beta_0, \ldots, \beta_{d-1})$ expressed as an element of $G$, i.e., in the
form $\sum_{w=0}^{d-1} \lambda_{jw} \beta_w$. Thus, the product $\prod_{i=1}^{p(|x|)} f(i,x)$ is

$$\sum_{w=0}^{d-1} \Big( \sum_j \lambda_{jw}\, k(j,x) \Big) \beta_w.$$

The encoding of the product is the $d$-tuple given by $\langle \sum_j \lambda_{j0}\, k(j,x), \ldots, \sum_j \lambda_{j,d-1}\, k(j,x) \rangle$. Each
of its components is a polynomial sum of a product of two things in $F$ and can be computed using
the first part of the lemma. □

For $\{F_n\} \in \mathrm{QAC}^{(0)}_{wf} = \mathrm{QACC}$, the vectors that $F_n$ acts on are elements of a $2^{n+p(n)}$ dimensional
space $\mathcal{E}_{1,n+p(n)}$ which is a tensor product of the 2-dimensional spaces $\mathcal{E}_1, \ldots, \mathcal{E}_{n+p(n)}$, which in
turn are each spanned by $|0\rangle$, $|1\rangle$. We write $\mathcal{E}_{j,k}$ for the subspace $\otimes_{i=j}^{k} \mathcal{E}_i$ of $\mathcal{E}_{1,n+p(n)}$. We now define
a succinct way to represent a set of vectors in $\mathcal{E}_{1,n+p(n)}$ which is useful in our argument below. A
tensor graph is a directed acyclic graph with one source node of indegree zero, one terminal node of
outdegree zero, and two kinds of edges: horizontal edges, which are unlabeled, and vertical edges,
which are labeled with a pair of amplitudes and a product of colors and anticolors. (The color
product may be the number 1.) We require that all paths from the source to the terminal traverse
the same number of vertical edges and that no vertex can have vertical edge indegree greater than
one or outdegree greater than one. For a color $c$ we write $\bar{c}$ for its corresponding anticolor. The
height of a node in a tensor graph is the number of vertical edges traversed to get to it on any path
from the source; the height of an edge is the height of its end node. The width of a tensor graph is
the maximum number of nodes of the same height. As an example of a tensor graph where our color
product is the number 1, consider the following figure:



[Figure: a tensor graph from the source node s to the terminal node t. All vertical edges carry the color product 1; the recoverable edge labels are $0, 1$; $\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}$; and $\frac{1}{2}, 0$.]

The rough idea of tensor graphs is that paths through the graph correspond to collections of vectors
in $\mathcal{E}_{1,n}$. For this particular figure the left path from the source node (s) to the terminal node (t)
corresponds to the vectors given by

$$|1\rangle \otimes \big( \tfrac{1}{\sqrt{2}} |0\rangle + \tfrac{1}{\sqrt{2}} |1\rangle \big) \otimes \tfrac{1}{2} |0\rangle$$

and the right hand path corresponds to

$$|0\rangle \otimes \big( \tfrac{1}{\sqrt{2}} |0\rangle + \tfrac{1}{\sqrt{2}} |1\rangle \big) \otimes \tfrac{1}{2} |0\rangle.$$

An $\mathcal{E}_{j,k}$-term in a tensor graph is a maximal induced tensor subgraph between a node of height
$j - 1$ and a node of height $k$. We also require that the horizontal indegree of the node at height
$j - 1$ be zero and that the horizontal outdegree of the node at height $k$ be zero. For the graph we
considered above there are two $\mathcal{E}_{1,1}$-terms and two $\mathcal{E}_{2,3}$-terms but only one $\mathcal{E}_{1,3}$-term, corresponding
to the whole figure.

Colors are used to handle controlled-not layers. A color $c$ and its anticolor $\bar{c}$ satisfy the following
multiplicative properties: $c \cdot c = \bar{c} \cdot \bar{c} = 1$ and $c \cdot \bar{c} = 0$. Given two distinct colors $b$ and $c$ we have
$b \cdot c = c \cdot b$ and $b \cdot \bar{c} = \bar{c} \cdot b$. If $a$ is a product of colors and anticolors not involving the color $b$ or $\bar{b}$
and $c$ is another product of colors we have $a(bc) = (ab)c$. We consider formal sums of products of
complex numbers times colors. We require complex numbers to commute with colors and require
colors and anticolors to distribute, i.e., if $a, b, c$ are colors or anticolors then $a \cdot (b + c) = a \cdot b + a \cdot c$
and $(b + c) \cdot a = b \cdot a + c \cdot a$. Finally, we require addition to work so that the above structure satisfies
the axioms of a $\mathbb{C}$-algebra. Given a tensor graph $G$, denote by $\mathcal{A}_G$ the $\mathbb{C}$-algebra above. Since

$$(a \cdot a) \cdot \bar{a} = \bar{a} \neq 0 = a \cdot (a \cdot \bar{a})$$

this algebra is not associative. However, in the sums we will consider, the terms will never have more
than two positions where a color or its anticolor can occur, so the products we will consider are
associative. Using our earlier encoding for the elements of $\mathbb{C}$ which could appear in a QACC
computation, it is straightforward to use sequence coding to get $\mathrm{TC}^{(0)}$ encodings of the relevant
elements of $\mathcal{A}_G$. As an example of how colors affect amplitudes, consider the following picture:



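[Figure: a tensor graph from s to t whose vertical edges are labeled with the color $b$ or its anticolor $\bar{b}$ together with pairs of amplitudes; two paths from s to t are drawn dotted.]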



The amplitude of $|1,0,0\rangle$ in the left hand dotted path is $b \cdot \frac{1}{\sqrt{2}} \cdot 1 \cdot \frac{1}{\sqrt{2}} \cdot b \cdot 1 = 1/2$ using commutativity
and $b^2 = 1$. Its amplitude in the right hand dotted path would be zero because of the last vertical
edge. However, vectors such as $|0,0,1\rangle$ would have nonzero amplitude in the right hand dotted
path. Nevertheless, the amplitude of any vector $|x\rangle$ in any path other than the dotted ones from
s to t will be 0, as $b \cdot \bar{b} = 0$. More formally, we define the amplitude of an $|x\rangle$ in a vertical edge as
equal to the left amplitude times the color product in the edge if $x$ is $|0\rangle$ and equal to the right
amplitude times the color product in the edge if $x$ is $|1\rangle$. The amplitude of a vector $|x_1, \ldots, x_j\rangle$ in
a path in a tensor graph is the product over $k$ from 1 to $j$ of the amplitude of the vectors $|x_k\rangle$ in
the vertical edge of height $k$. The amplitude of a vector $|x_j, \ldots, x_k\rangle$ in an $\mathcal{E}_{j,k}$-term is the sum of
its amplitudes in its paths. The amplitude of a vector $|x_1, \ldots, x_{p(n)}\rangle$ in a tensor graph $G$ is defined
to be the sum of its amplitudes in $G$'s $\mathcal{E}_{1,p(n)}$-terms.
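
The definition of the amplitude of a basis vector along a single path can be made concrete with a small sketch. The representation below is an assumption made for illustration: a color is a string such as `'b'`, its anticolor is written `'~b'`, and a vertical edge is a triple (list of color literals, left amplitude, right amplitude). Floating point numbers stand in for the exact field elements used in the paper.

```python
import math

def multiply_colors(literals):
    """Multiply color literals such as 'b' (color) or '~b' (anticolor).
    Returns 0 if a color meets its own anticolor; otherwise returns the tuple
    of literals left unpaired (the empty tuple stands for the number 1).
    Assumes each color or anticolor occurs at most twice, so the order of
    multiplication does not matter."""
    seen = {}
    for lit in literals:
        name = lit.lstrip('~')
        if name in seen:
            if seen[name] != lit:
                return 0          # c * ~c = 0
            del seen[name]        # c * c = ~c * ~c = 1
        else:
            seen[name] = lit
    return tuple(sorted(seen.values()))

def path_amplitude(edges, bits):
    """Amplitude of the basis vector `bits` along one path.
    edges[k] = (colors, left_amp, right_amp) is the vertical edge of height k+1.
    Returns (number, residual color product); (0.0, ()) if the colors cancel to 0."""
    amp = 1.0
    colors = []
    for (cols, left, right), bit in zip(edges, bits):
        amp *= left if bit == 0 else right
        colors.extend(cols)
    residual = multiply_colors(colors)
    if residual == 0:
        return 0.0, ()
    return amp, residual

# A hypothetical three-edge path (not the one in the figure).
s = 1 / math.sqrt(2)
matched = [(['b'], 0.0, 1.0), ([], s, s), (['b'], s, 0.0)]
mixed = [(['b'], 0.0, 1.0), ([], s, s), (['~b'], s, 0.0)]
```

Here `path_amplitude(matched, (1, 0, 0))` gives a number close to $1/2$ with color product 1, while the same vector has amplitude 0 along `mixed` because $b \cdot \bar{b} = 0$.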

As we will be interested in families of tensor graphs $\{G_n\}$ corresponding to our circuit families,
we want to look at those families with a certain degree of uniformity. We say a family of tensor
graphs $\{G_n\}$ is color consistent if: (1) the number of colors for edges of the same height is bounded
by a constant $k$ with respect to $n$, (2) the number of heights in which a given color/anticolor can
appear is exactly two (colors and their anticolors must appear on the same heights), (3) each color
product at the same height is of the form $\prod_{i=0}^{k} l_i$ where $l_i$ must be either a color $c_i$ or $\bar{c}_i$ (it follows
there are $2^k$ possible color products for edges at a given height). We say that a color/anticolor
is active at a given height if the height is at or after the first height at which the color/anticolor
occurs and is below the height of its second occurrence. The family is further said to be log-color
depth if the number of active colors/anticolors of a given height is log-bounded.

Theorem 4.1 Let $\{F_n\}$ be a family of QACC operators and let $\{\langle z_n|\}$ be a family of observables. (1)
There is a color-consistent family of tensor graphs of width $2^{2^{2t}}$ and polynomial size representing
the output amplitudes of $U_1 \cdots U_t |z_n\rangle$, where the $U_i$ are the layers of $F_n$. (2) If $\{F_n\}$ is in QACC$^{\log}_{pl}$ then
the family of tensor graphs will be of log-color depth. (3) If $\{F_n\}$ is in QACC$^{\log}_{gates}$ then the number
of paths from the source to the terminal node is polynomially bounded.



Proof. The proof is by induction on $t$. In the base case, $t = 0$, we do not multiply any layers, and
we can easily represent this as a tensor graph of width 1. Assume for $j < t$ that $U_j \cdots U_1 |x, 0^{p(n)}\rangle$
can be written as a color consistent tensor graph of width $2^{2^{2j}}$ and polynomial size. There are two
cases to consider: in the first case the layer is a tensor product of matrices $M_1 \otimes \cdots \otimes M_v$ where
the $M_k$'s are Toffoli gates, one qubit gates, or fan-out gates (since $\mathrm{QAC}^{(0)}_{wf} = \mathrm{QACC}$); in the second
case the layer is a controlled-not layer.

For the first case we "multiply" $U_t$ against our current graph by "multiplying" each $M_j$ in
parallel against the terms in our sum corresponding to $M_j$'s domain, say $\mathcal{E}_{j',k'}$. If
$M_j = \begin{pmatrix} u_{00} & u_{01} \\ u_{10} & u_{11} \end{pmatrix}$
with domain $\mathcal{E}_{j'}$ is a one-qubit gate, then we multiply the two amplitudes in each vertical edge
of height $j'$ in our tensor graph by $M_j$. This does not affect the width, size, or number of paths
through the graph. If $M_j$ is a Toffoli gate, then for each term $S$ in $\mathcal{E}_{j',k'}$ in our tensor graph we
add one new term to the resulting graph. This term is added by adding a horizontal edge going
out from the source node of $S$ followed by the new $\mathcal{E}_{j',k'}$-term followed by a horizontal edge into
the terminal node of $S$. The new term is obtained from the old one by setting to 0 the left hand
amplitudes of all edges in $S$ of height between $j'$ and $k' - 1$, and then if $\alpha, \gamma$ is the amplitude of an
edge of height $k'$ in the new term we change it to $\gamma - \alpha, \alpha - \gamma$. This new term adjusts the amplitude
for the case of the all-ones basis vector $|1 \cdots 1\rangle$ in $\mathcal{E}_{j',k'-1}$ tensored with either a $|0\rangle$ or $|1\rangle$. This operation
increases the width of the new tensor graph by the width of the $\mathcal{E}_{j',k'}$-term for each $\mathcal{E}_{j',k'}$-term in
the graph. Since the original graph has width $2^{2^{2(t-1)}}$ there are at most this many starting and
ending vertices for such terms. So there are at most $(2^{2^{2(t-1)}})^2$ such terms. Each of these terms has
width at most $2^{2^{2(t-1)}}$. Thus, the new width is at most

$$2^{2^{2(t-1)}} + \big(2^{2^{2(t-1)}}\big)^2 \cdot 2^{2^{2(t-1)}} < 2^{2^{2t}}.$$

Notice this action adds one new path through the $\mathcal{E}_{j',k'}$ part of the graph for every existing one.
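
The arithmetic behind the width bound is short: writing $W = 2^{2^{2(t-1)}}$, the new width is at most $W + W^3$, while $2^{2^{2t}} = W^4$. The following check, with hypothetical helper names, confirms the inequality for small $t$ using exact integers; it is a sanity check on the arithmetic, not part of the proof.

```python
def width_bound(t):
    """The bound 2^(2^(2t)) on the width after t layers."""
    return 2 ** (2 ** (2 * t))

def toffoli_step_ok(t):
    """Check W + W^2 * W < 2^(2^(2t)) for W = 2^(2^(2(t-1)))."""
    w = width_bound(t - 1)
    return w + w * w * w < width_bound(t)

def cnot_step_ok(t):
    """Check 2 * W < 2^(2^(2t)), the doubling used for controlled-not layers."""
    return 2 * width_bound(t - 1) < width_bound(t)
```

Both checks hold for every $t \ge 1$ because $W \ge 2$ implies $W + W^3 < W^4$ and $2W < W^4$.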

Now suppose $M_j$ is a fan-out gate, let $S$ be an $\mathcal{E}_{j',k'}$-term in our tensor graph and let $e$ be any
vertical edge in $S$ in $\mathcal{E}_{j'}$. Suppose $e$ has amplitude $\alpha$ for $|0\rangle$ and amplitude $\gamma$ for $|1\rangle$. In the new
graph we change the amplitude of $e$ to $\alpha, 0$. We then add a horizontal edge out of the source node
of $S$ followed by a new $\mathcal{E}_{j',k'}$-term followed by a horizontal edge into the terminal node of $S$. The
new term is obtained from $S$ by changing the amplitude for edges in $\mathcal{E}_{j'}$ with amplitudes $\alpha, \gamma$ in $S$
to $0, \gamma$. The amplitudes of the non-$\mathcal{E}_{j'}$ edges in this term are the reverse of the corresponding edge
in $S$, i.e., if the edge in $S$ had amplitude $\delta, \zeta$ then the new term edge would have amplitude $\zeta, \delta$.
The same argument as in the Toffoli case shows the new width is bounded by $2^{2^{2t}}$ and that this
action adds one new path through the $\mathcal{E}_{j',k'}$ part of the graph for every existing one.

For the case of a controlled-not layer, suppose we have a controlled-not going from line $i$ onto
line $j$. Let $c, \bar{c}$ be a new color, anticolor pair not yet appearing in the graph. Let $e_i$ be a vertical
edge of height $i$ in the graph and let $c_i, \alpha_i, \gamma_i$ be respectively its color product and two amplitudes.
Similarly, let $e_j$ be a vertical edge of height $j$ in the graph and $c_j, \alpha_j, \gamma_j$ be its color product and
two amplitudes. In the new graph we multiply $c$ times the color product of $e_i$ and $e_j$ and change
the amplitude of $e_i$ to $\alpha_i, 0$. We then add a horizontal edge going out from the starting node of $e_i$,
followed by a vertical edge with values $c_i \cdot \bar{c}, 0, \gamma_i$, followed by a horizontal edge into the terminal
node of $e_i$. In turn, we add a horizontal edge going out of the starting node of $e_j$, followed by
a vertical edge with values $c_j \cdot \bar{c}, \gamma_j, \alpha_j$, followed by a horizontal edge into the terminal node of
$e_j$. We handle all other controlled gates in this layer in a similar fashion (recall they must go
to disjoint lines). We add at most one new vertex of a given height for every existing vertex of a
given height. So the total width is at most doubled by this operation, and $2 \cdot 2^{2^{2(t-1)}} < 2^{2^{2t}}$. In
the QACC$^{\log}_{pl}$ case, simulating a layer which is a Kronecker product of spaced controlled-not gates
and identity matrices, notice we would at most add one to the color depth at any place. So if a
controlled-not layer is a composition of $O(\log)$ many such layers it will increase the color depth by
$O(\log)$. In the QACC$^{\log}_{gates}$ case, notice that in simulating a single controlled-not we add one new path
for each existing path through the graph at each of the two heights affected. This gives three new
paths on the whole subspace for each old one.
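
To see why a shared color and anticolor pair reproduces a controlled-not, it can help to work through two qubits in a product state. The sketch below assumes the control edge carries amplitudes $(\alpha_i, \gamma_i)$ and the target edge $(\alpha_j, \gamma_j)$, builds the four paths described above, and keeps only those whose colors multiply to 1 (that is, $c \cdot c$ or $\bar{c} \cdot \bar{c}$). The function names and the tuple layout are illustrative assumptions.

```python
from itertools import product

def cnot_direct(ai, gi, aj, gj):
    """Amplitudes of CNOT(control, target) applied to
    (ai|0> + gi|1>) tensor (aj|0> + gj|1>), keyed by (control bit, target bit)."""
    state = {(x, y): (ai, gi)[x] * (aj, gj)[y]
             for x, y in product((0, 1), repeat=2)}
    return {(x, y ^ x): amp for (x, y), amp in state.items()}

def cnot_via_colors(ai, gi, aj, gj):
    """Same amplitudes, computed as a sum over paths with color labels.
    Each edge is (color literal, left amplitude, right amplitude)."""
    control_edges = [('c', ai, 0), ('~c', 0, gi)]    # original edge, added edge
    target_edges = [('c', aj, gj), ('~c', gj, aj)]   # original edge, swapped edge
    out = {}
    for x, y in product((0, 1), repeat=2):
        total = 0
        for (c1, l1, r1), (c2, l2, r2) in product(control_edges, target_edges):
            color_value = 1 if c1 == c2 else 0   # c*c = ~c*~c = 1, c*~c = 0
            total += color_value * (l1, r1)[x] * (l2, r2)[y]
        out[(x, y)] = total
    return out
```

With integer inputs such as `(1, 2, 3, 5)` the two functions return the same table, and the two mixed-color paths contribute nothing, which is the role the colors play in the construction.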

Since we have handled the two possible layer cases and the changes we needed to make only
increase the resulting tensor graph polynomially, we have established the induction step and
(1) and (2) of the theorem. For (3), observe that for each multi-line gate we handle in adding a layer we
at most quadruple the number of paths through the subspace where that gate applies. Since there
are at most logarithmically many such gates, the number of paths through the graph increases
polynomially. □

Theorem 4.2 Let $\{G_n\}$ be a family of constant width color-consistent tensor graphs of vectors
in $\mathcal{E}_{1,p(n)}$. Assume the coefficients of amplitudes in the $\{G_n\}$ can be encoded in $\mathrm{TC}^{(0)}$ using our
encoding scheme described earlier and that $\{G_n\}$ has log-color depth. Then the amplitude of any
basis vector of $\mathcal{E}_{1,p(n)}$ in $G_n$ is P/poly computable. If the number of paths through the graph from
the source to the terminal node is polynomially bounded then the amplitude of any basis vector is
$\mathrm{TC}^{(0)}$ computable.

Proof. Let $G_n$ be a particular graph in the family and let $|x_n\rangle$ be the vector whose amplitude
we want to compute. Assume that all graphs in our family have fewer than $k$ colors in any color
product and have a width bounded by $w$. We will proceed from the source to the terminal node
one height at a time to compute the amplitude. Since the width is $w$, the number of $\mathcal{E}_1$-terms is at
most $w$ and each of these must have width at most $w$. Let $\alpha_{1,1}, \ldots, \alpha_{1,w}$ (some of which may be
zero) denote the amplitudes in $\mathcal{A}_{G_n}$ of $|x_{n,1}\rangle$ in each of these terms. The $\alpha_{1,i}$ are each sums of at
most $w$ amplitudes times the color products of at most $k$ colors and anticolors, so the encoding of
these $w$ amplitudes is $\mathrm{TC}^{(0)}$ computable. Because of the restriction on the width of $G_n$ there are at
most $w$ many $\mathcal{E}_{1,j}$-terms, $w^2$ many $\mathcal{E}_{j,j+1}$-terms, and $w$ many $\mathcal{E}_{1,j+1}$-terms. Fixing some ordering
on the nodes of height $j$ and $j + 1$, let $\gamma_{j,i,k}$ be the amplitude of $|x_{n,j+1}\rangle$ in the $\mathcal{E}_{j,j+1}$-term with
source the $i$th node of height $j$ and with terminal node the $k$th node of height $j + 1$. The amplitude
is zero if there is no such $\mathcal{E}_{j,j+1}$-term. Then the amplitudes $\alpha_{j+1,1}, \ldots, \alpha_{j+1,w}$ of the $\mathcal{E}_{1,j+1}$-terms
can be computed from the amplitudes $\alpha_{j,1}, \ldots, \alpha_{j,w}$ of the $\mathcal{E}_{1,j}$-terms using the formula

$$\alpha_{j+1,k} = \sum_{i=1}^{w} \alpha_{j,i} \cdot \gamma_{j,i,k}.$$

Thus $\alpha_{j+1,k}$ can be computed from the $\alpha_{j,i}$ using a polynomial sized circuit to do these adds and
multiplies. Similarly, each $\alpha_{j,i}$ can be computed by polynomial sized circuits from the $\alpha_{j-1,i'}$'s
and so on. Since we have log-color depth, the number of terms consisting of elements in our field
times color products in an $\alpha_{j,k}$ will be polynomial. So the size of the $\alpha_{j,k}$'s, $j \le p(n)$, $k \le w$, will be
polynomial in the input $x_n$. So the circuits for each $\alpha_{j,k}$ where $j \le p(n)$ and $k \le w$ will
be of polynomial size. There is only one $\mathcal{E}_{1,p(n)}$-term in $G_n$ and its amplitude is that of $|x_n\rangle$, so this
shows it has polynomial sized circuits.
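
The height-by-height computation in this proof is a small dynamic program: the $w$ amplitudes at height $j+1$ are obtained from the $w$ amplitudes at height $j$ through a $w \times w$ table of edge amplitudes. The sketch below shows only that recurrence. As a simplifying assumption it treats all amplitudes as plain numbers and ignores color products, which in the proof are carried along as formal symbols; the name `amplitude_by_heights` is hypothetical.

```python
def amplitude_by_heights(first, gammas):
    """first: the amplitudes alpha_{1,1}, ..., alpha_{1,w} at height 1.
    gammas: one table per later height; gamma[i][k] is the amplitude of the
    next bit on the term from the i-th node of the current height to the
    k-th node of the next height (0 if there is no such term).
    Returns the list of amplitudes at the final height."""
    alpha = list(first)
    for gamma in gammas:
        width_next = len(gamma[0])
        alpha = [sum(alpha[i] * gamma[i][k] for i in range(len(alpha)))
                 for k in range(width_next)]
    return alpha

# Width 2, two further heights; the second table funnels everything into node 0.
example = amplitude_by_heights([1, 2], [[[1, 0], [0, 1]], [[3, 0], [1, 0]]])
```

Each step uses about $w^2$ multiplications, so with constant $w$ and $p(n)$ heights the total work is polynomial once the size of each formal sum is bounded, which is where the log-color depth assumption enters.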

For the $\mathrm{TC}^{(0)}$ result, if the number of paths is polynomially bounded, then the amplitude can
be written as the polynomial sum of the amplitudes in each path. The amplitude in a path can in
turn be calculated as a polynomial product of the amplitudes times the colors on the vertical edges
in the path. Our condition on every color appearing at exactly two heights guarantees the color
product along the whole path will be 1 or 0, and will be zero iff we get a color and its anticolor on
the path. This is straightforward to check in $\mathrm{TC}^{(0)}$, so this sum of products can be computed
in $\mathrm{TC}^{(0)}$ using Lemma 4.1. □



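Corollary 4.3

(1) EQACC$^{\log}_{pl} \subseteq$ NQACC$^{\log}_{pl} \subseteq$ P/poly, and BQACC$^{\log}_{\mathbb{Q},pl} \subseteq$ P/poly.

(2) EQACC$^{\log}_{gates} \subseteq$ NQACC$^{\log}_{gates} \subseteq \mathrm{TC}^{(0)}$, and BQACC$^{\log}_{\mathbb{Q},gates} \subseteq \mathrm{TC}^{(0)}$.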


Proof. Given a family $\{F_n\}$ of QACC$^{\log}_{pl}$ operators and a family $\{\langle z_n|\}$ of states we can use
Theorem 4.1 to get a family $\{G_n\}$ of log color depth, color-consistent tensor graphs representing
the amplitudes of $F_n^{-1}|z_n\rangle$. Note $\{F_n^{-1}\}$ is also a family of QACC$^{\log}_{pl}$ operators since Toffoli and
fan-out gates are their own inverses, the inverse of any one qubit gate is also a one qubit gate
(albeit usually a different one), and finally a controlled-not layer is its own inverse. Theorem 4.2
shows there is a P/poly circuit computing the amplitude of any vector $|x_n\rangle$ in this graph. This
amounts to calculating

$$\langle x_n | F_n^{-1} | z_n \rangle = \overline{\langle z_n | F_n | x_n \rangle}.$$

If this is nonzero, then $|\langle z_n | F_n | x_n \rangle|^2 > 0$, and we know $x$ is in the language. In the BQACC$_{\mathbb{Q}}$ case
everything is rational, so P/poly can explicitly compute the magnitude of the amplitude and check
if it is greater than 3/4. The $\mathrm{TC}^{(0)}$ result follows similarly from the $\mathrm{TC}^{(0)}$ part of Theorem 4.1. □



5 Discussion and Open Problems 

A number of questions are suggested by our work. 

• Is all of NQACC in $\mathrm{TC}^{(0)}$ or even P/poly? We conjecture that NQACC is in $\mathrm{TC}^{(0)}$. As
mentioned in the introduction, we have developed techniques that remove some of the important
obstacles to proving this.

• Are there any natural problems in NQACC that are not known to be in ACC?

• What exactly is the complexity of the languages in EQACC, NQACC and BQACC$_{\mathbb{Q}}$? We
entertain two extreme possibilities. Recall that the class ACC can be computed by quasipolynomial
size depth 3 threshold circuits. It would be quite remarkable if EQACC could
also be simulated in that manner. However, it is far from clear if any of the techniques used
in the simulations of ACC (the Valiant-Vazirani lemma, composition of low-degree polynomials,
modulus amplification via the Toda polynomials, etc.), which seem to be inherently
irreversible, can be applied in the quantum setting. At the other extreme, it would be equally
remarkable if NQACC and NQTC$^{(0)}$ (or BQACC$_{\mathbb{Q}}$ and NQTC$^{(0)}$) coincide. Unfortunately,
an optimal characterization of QACC language classes anywhere between those two extremes
would probably require new (and probably difficult) proof techniques.

• How hard are the fixed levels of QACC? While lower bounds for QACC itself seem impossible 
at present, it might be fruitful to study the limitations of small depth QACC circuits (depth 
2, for example). 

Acknowledgments: We thank Cris Moore for pointing out an error in an earlier version of
Theorem 4.1, and Bill Gasarch for helpful comments and suggestions.